Abstract

We study how well a tree metric is able to preserve the sum of pairwise distances of an
arbitrary metric. This problem is closely related to low-stretch metric embeddings and is
of independent interest within the line of research on such embeddings proposed in the
literature.

As the structure of a tree imposes strong constraints on the pairwise distances, embedding
a metric into a tree metric is known to incur a maximum pairwise stretch of $\Omega(\log n)$ in the
worst case. We show, however, that from the perspective of average performance there exist
tree metrics that preserve the sum of pairwise distances of the given metric up to a small
constant factor, which we also show to be no worse than twice the best one can possibly
expect. The approach we use to tackle this problem is more direct than that of a previous
result [4], and it also leads to a provably better guarantee. Second, when the given metric is
extracted from a Euclidean point set of finite dimension $d$, we show that there exist spanning
trees of the given point set such that the sum of pairwise distances is preserved up to a
constant that depends only on $d$. Both of our proofs are constructive. The main ingredient in
our result is a special point-set decomposition which relates two seemingly unrelated quantities.

1 Introduction 

The problem of approximating a given metric by a metric which is structurally simpler has been 
a central issue to the theory of finite metric embedding and has been studied extensively in the 
past decades. A particularly simple metric of interest, which is also favorable from the algorithmic
perspective, is a tree metric. By a tree metric we mean a metric induced by the shortest distances
between pairs of points in a tree containing the given points. Generally we would require the 
distances in the given metric not to be underestimated in the target metric, which is crucial for 
most of the applications, and we would like to bound the increase of the distances, distortion, 
or stretch, from above. See [21 El [12]. On the other hand, a similar and equally important 
problem in network design is to find a tree spanning the network, represented by a graph, that 
provides a good approximation to the shortest path metric defined in the graph [21 [11] . 

Let $\mathcal{M} = (V, d)$ and $\mathcal{M}' = (V, d')$ be two metrics over the same point set $V$ such that
$d'(u,v) \ge d(u,v)$ for all $u, v \in V$. For each $u, v \in V$, let $\mathrm{stretch}(u,v) = d'(u,v)/d(u,v)$ be the
pairwise stretch, or distortion, between the pair $u$ and $v$. Different notions have been suggested
to quantify how well the distances of $\mathcal{M}$ are preserved in $\mathcal{M}'$, e.g.,

1. Maximum pairwise stretch [15], defined by $\max_{u,v \in V} \mathrm{stretch}(u, v)$, which is closely related
to the extensively studied spanner problems.

2. Average pairwise stretch [2], [3], defined by $\left(\sum_{u,v \in V} \mathrm{stretch}(u, v)\right) / \binom{n}{2}$.

3. Distance-weighted average stretch [13], [THJ, [23], defined as

$$\frac{1}{\sum_{u,v \in V} d(u,v)} \sum_{u,v \in V} d(u, v) \cdot \mathrm{stretch}(u, v) = \frac{\sum_{u,v \in V} d'(u,v)}{\sum_{u,v \in V} d(u,v)}.$$

This measure makes sense in real-time scenarios when it is less desirable and more costly 
to raise the distances of distant pairs than that of close pairs. For example, the effect of 
raising the delay of a pair from 2 seconds to 10 seconds is less tolerable than raising the 
delay of another pair from 20 ms to 100 ms. Throughout this paper we will also refer to 
the sum of pairwise distances as the routing cost following the terminology used in the 
literature. 
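
The three measures above can be compared on a toy instance. The following is a minimal sketch, not taken from the paper: the function name `stretch_measures`, the three-point line metric, and the "star" metric are all illustrative assumptions, and sums are taken over unordered pairs.

```python
from itertools import combinations


def stretch_measures(points, d, d_prime):
    """Return (maximum, average, distance-weighted average) stretch.

    d and d_prime are distance functions on pairs of points. It is assumed
    that d_prime dominates d and that d(u, v) > 0 for distinct u, v.
    """
    pairs = list(combinations(points, 2))
    stretches = [d_prime(u, v) / d(u, v) for u, v in pairs]
    max_stretch = max(stretches)
    avg_stretch = sum(stretches) / len(pairs)
    # Ratio of routing costs: sum of d' over sum of d.
    weighted = sum(d_prime(u, v) for u, v in pairs) / sum(d(u, v) for u, v in pairs)
    return max_stretch, avg_stretch, weighted


# Hypothetical example: three points on a line, and a dominating metric in
# which every pair of distinct points is at distance 4.
def line(u, v):
    return abs(u - v)


def star(u, v):
    return 0 if u == v else 4


print(stretch_measures([0, 1, 3], line, star))
```

In this example the close pair dominates the maximum stretch, while the distance-weighted average is governed by the ratio of the two routing costs.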

In this work, we address the problem of how well a tree is able to preserve the sum of
pairwise distances, or, the distance-weighted average stretch, of an underlying metric. To be
more precise, let $\mathcal{M} = (V,d)$ and $\mathcal{M}' = (V',d')$ be two metrics. We say that $\mathcal{M}'$ dominates
$\mathcal{M}$ if $V' \supseteq V$ and, for all $u,v \in V$, we have $d'(u,v) \ge d(u,v)$. We consider the following two
problems.

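Problem 1. Let $\mathcal{M} = (V, d)$ be a given metric and $\mathcal{D}(\mathcal{M})$ be the set of dominating tree metrics
of $\mathcal{M}$. What is

$$\inf_{(V',d') \in \mathcal{D}(\mathcal{M})} \frac{\sum_{u,v \in V} d'(u,v)}{\sum_{u,v \in V} d(u,v)}\ ?$$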

Problem 2. Let $V$ be a set of points in $\mathbb{R}^d$, $|u,v|$ be the straight-line distance between two
points $u, v \in V$, $ST(V)$ be the set of spanning trees of $V$, and $d_T$ be the distance function of
$T$, for any $T \in ST(V)$. What is

$$\inf_{T \in ST(V)} \frac{\sum_{u,v \in V} d_T(u,v)}{\sum_{u,v \in V} |u,v|}\ ?$$

We remark on Problem 2 that, although we can consider the Euclidean metric extracted
from $V$ as we did in Problem 1, dominating tree metrics of it do not necessarily correspond to
any spanning tree of $V$. In fact, if we apply the approaches for Problem 1 directly, the lack of
a balance guarantee in each partition can make the resulting pairwise distances arbitrarily large.

Embedding metrics into tree metrics was introduced in the context of probabilistic
embedding by Alon et al. [5]. What followed was a series of notable works. Bartal [6] considered
probabilistic embeddings and proved that any metric can be probabilistically approximated by
tree metrics with expected maximum distortion $O(\log^2 n)$. This result was later improved to
$O(\log n \log\log n)$ [7]. Bartal also observed that any probabilistic embedding into a tree has
distortion at least $\Omega(\log n)$. This gap was closed by Fakcharoenphol et al. [12], who showed
that for any metric, there exist tree metrics with $O(\log n)$ distortion.

Problem 3. Given a metric $\mathcal{M} = (V,d)$ and a weight function $w : V \times V \to \mathbb{R}^+$, find a
dominating tree metric $T$ of $\mathcal{M}$ such that $\sum_{u,v \in V} w_{uv} \cdot d_T(u, v) \le \alpha \sum_{u,v \in V} w_{uv} \cdot d(u, v)$.

As Charikar et al. [10] showed by linear program duality that computing probabilistic
embeddings of a given metric and Problem 3 described above are in fact dual problems, the
series of work led by Bartal (6] [TJ \TT\ IT2] has provided improved approximation results for 
a large set of problems, including buy-at-bulk network design, vehicle routing, metric labeling, 
group Steiner tree, and the minimum cost communication network problem. Refer to [10] for
more details and applications.

Kleinberg, Slivkins, and Wexler [14] initiated the study of partial embedding and scaling
distortion, which can be regarded as embedding with relaxed guarantees. In a series of subsequent
works, Abraham et al., [H H] proved that any finite metric embeds probabilistically in a tree
metric such that the distortion of a $(1 - \epsilon)$ portion of the pairs is bounded by $O(\log(1/\epsilon))$ for any
$0 < \epsilon < 1$. They also observed a lower bound of $\Omega(\sqrt{1/\epsilon})$, which is closed by Abraham et al.
in [2].

In particular, Abraham et al. [4] showed that any metric can be probabilistically embedded
into a tree metric such that the ratio between the expected sum of pairwise distances is $O(\log \Phi)$,
where $\Phi$ is the effective aspect ratio of the given distribution. This provides an upper bound for
Problem 1, which we consider. However, the guarantee they provide is loose due to the constant
inherited from the guarantee on scaling distortion. See also [HE1I2]. Rabinovich [IB] showed
that it is possible to embed certain special graph metrics into the real line such that the
distance-weighted average stretch is bounded by a constant.

On the other hand, for approximating arbitrary graph metrics by their spanning trees, a
simple $\Omega(n)$ lower bound in terms of maximum stretch is known for $n$-cycles [IT]. Alon, Karp,
Peleg, and West [5] considered a distribution over spanning trees and proved an upper bound of
$2^{O(\sqrt{\log n \log\log n})}$ on the expected distortion. Elkin et al., [H] showed how a spanning tree with
$O(\log^2 n \log\log n)$ average stretch (over the set of edges) can be computed in polynomial time.
In terms of average pairwise stretch, Abraham et al. [2] showed the existence of a spanning
tree such that, for any $0 < \epsilon < 1$, the distortion of a $(1 - \epsilon)$ fraction of the pairs is bounded
by $O(\sqrt{1/\epsilon})$. Note that this implies an $O(1)$ average pairwise stretch. Smid [TH] gave a simpler
proof for this result when the metric is Euclidean.

In terms of the sum of pairwise distances in graphs (routing cost), Johnson et al. [13] showed
that computing the spanning tree of minimum routing cost is NP-hard. Polynomial-time
approximations as well as approximation schemes have been proposed by Wong [19] and Wu et
al. [23]. Despite these efforts, however, no general guarantees have been made on the
ratio between the routing cost of the optimal spanning tree and that of the underlying graph.
Other reasonable variations have been considered as well, e.g., sum-requirement routing trees,
product-requirement routing trees, and multi-source routing trees [20], [22], [21].

Our Contribution. In this work, we take a different approach to tackle Problem 1 directly
and obtain a provably small upper bound. Specifically, we adopt the notion of hierarchically
well-separated trees (HSTs), introduced by Bartal [7] and Fakcharoenphol et al. [12], and show that,
for any given metric $\mathcal{M}$, there exists a 2-HST, $\mathcal{M}'$, such that the distance-weighted average
stretch of $\mathcal{M}'$ is bounded by 14.24. The main ingredient of this result is a special point-set
decomposition which relates two seemingly unrelated quantities, namely, the diameter of the
point set and the sum of pairwise distances between two separated subsets.

If we do not require HSTs, it is also possible to apply our technique and construct
so-called ultra-metrics, which were introduced by Abraham [2] and Bartal [8], with a smaller stretch of
3.56. This provides a better and explicit guarantee than that provided in [4] (improving on a
constant greater than 64). On the negative side, we show that there exist metrics for which no
dominating tree metric can preserve the sum of pairwise distances to a factor better than 2.
This shows that our result is within twice the best one can achieve.

As a side-product, we prove the existence of spanning trees with $O(d\sqrt{d})$ distance-weighted
average stretch for any point set in Euclidean space $\mathbb{R}^d$. To this end, we use our point-set
cutting lemma to decompose the points recursively. In order to guarantee a constant blow-up
in the diameter of the spanning tree, however, instead of allowing arbitrary cuts, we show that it
is always possible to make a balanced decomposition such that the diameters of the partitioned
sets stay balanced. Our result provides a good guarantee when the dimension of the given
Euclidean graph is low, which is true for most communication networks. Although it is possible
to apply the framework of [3], [2] to obtain a spanning tree of constant distance-weighted average
stretch, the hidden constant is huge ($> 10^5$), which makes it less useful in practice. Both
of our proofs are constructive.

2 Preliminary 

First we define some notation that will be used throughout this paper. Let $(M, d)$ be a finite
metric space, where $M$ is the set of vertices and $d$ is the distance function. Without loss of
generality, we shall assume that the smallest distance defined by $d$ is strictly more than 1. Let
$X \subseteq M$ be a subset of $M$. The radius of $X$ with respect to a specific element $y \in X$ is defined
to be $\Delta_y(X) = \max_{z \in X} d(y, z)$. The diameter of $X$ is defined to be $\Delta(X) = \max_{y \in X} \Delta_y(X)$.
For any $r > 0$, an $r$-net decomposition of $(M, d)$ is a partition of $M$ into clusters, where each
cluster, say $C$, has radius at most $r$ with respect to a certain vertex $u \in C$.

Definition 1 (Hierarchical net decomposition). Let $(M,d)$ be a metric and $\delta = \lceil \log_2 \Delta(M) \rceil$.
A hierarchical net decomposition of $(M, d)$ is a sequence of $\delta + 1$ nested net decompositions
$D_0, D_1, \ldots, D_\delta$ such that

• $D_\delta = \{M\}$ is the trivial partition that puts all vertices in a single cluster.

• $D_i$ is a $2^i$-net decomposition and a refinement of $D_{i+1}$.

A laminar family $\mathcal{F} \subseteq 2^M$ of a set $M$ is a family of subsets of $M$ such that for any $A, B \in \mathcal{F}$,
we have either $A \subseteq B$, $B \subseteq A$, or $A \cap B = \emptyset$. Clearly a hierarchical net decomposition
defines a laminar family and naturally corresponds to a rooted tree, which is referred to as
a hierarchically well-separated tree (HST), as follows. Each set $S$ in the laminar family is a
node in the tree, and the children of the node corresponding to $S$ are the nodes corresponding
to maximal subsets of $S$ in the family.

The distance function on this tree is defined as follows. The links from a node $S$ in $D_i$ to
each of its children in the tree have length equal to $2^i$. This induces a distance function $d_T$
on $M$, where $d_T(u, v)$ is equal to the length of the shortest path in $T$ from node $u$ to
node $v$.

Definition 2 (Ultrametric). An ultrametric $M$ is a metric space $(M, d)$ whose elements are the
leaves of a rooted labelled tree $T$ such that the following is met. Each node $v \in T$ is associated
with a label $\ell(v) \ge 0$ such that if $u \in T$ is a descendant of $v$ then $\ell(u) \le \ell(v)$, and $\ell(v) = 0$ if and
only if $v$ is a leaf node. The distance between leaves $u, v \in M$ is defined as $d(u, v) = \ell(\mathrm{lca}(u, v))$,
where $\mathrm{lca}(u, v)$ is the least common ancestor of $u$ and $v$ in $T$.
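
As an illustration of Definition 2, the sketch below evaluates an ultrametric distance by walking up to the least common ancestor. The representation (a `parent` dictionary and a `label` dictionary) and the small example tree are assumptions made for this sketch, not structures prescribed by the paper.

```python
def ultrametric_distance(parent, label, u, v):
    """Distance between leaves u and v: the label of their least common ancestor.

    parent maps every non-root node to its parent; label maps every node to
    its label (0 exactly at the leaves, non-decreasing towards the root).
    """
    # Collect u and all of its ancestors.
    ancestors = set()
    node = u
    while node is not None:
        ancestors.add(node)
        node = parent.get(node)
    # Climb from v until we hit an ancestor of u: that node is lca(u, v).
    node = v
    while node not in ancestors:
        node = parent[node]
    return label[node]


# Hypothetical labelled tree: root r (label 4) with children x (label 1)
# and leaf c; x has leaf children a and b.
parent = {"a": "x", "b": "x", "x": "r", "c": "r"}
label = {"a": 0, "b": 0, "c": 0, "x": 1, "r": 4}
print(ultrametric_distance(parent, label, "a", "b"))
print(ultrametric_distance(parent, label, "a", "c"))
```

Because labels never decrease towards the root, the largest of the three pairwise distances among any three leaves is attained at least twice, which is the usual strong triangle inequality.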

Note that, under this definition, the metric extracted from a hierarchically well-separated 
tree is also an ultrametric. 

Definition 3 (Centripetal metric). Given a metric $(M,d)$ and a vertex $x \in M$, we define the
centripetal metric $(M,d_x)$ of $(M,d)$ with respect to $x$ as $d_x(u,v) = |d(u,x) - d(v,x)|$.
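
A small sketch of Definition 3 follows. The helper name `centripetal` and the planar example points are illustrative assumptions; the Euclidean distance is taken from the standard library. By the triangle inequality, $d_x(u,v) \le d(u,v)$ always holds, which the example makes visible.

```python
import math


def centripetal(d, x):
    """Return the centripetal distance function d_x of d with respect to x."""
    def d_x(u, v):
        return abs(d(u, x) - d(v, x))
    return d_x


# Hypothetical example in the plane with the Euclidean distance.
origin = (0.0, 0.0)
d_x = centripetal(math.dist, origin)

# Points on the same ray from the origin keep their distance ...
print(d_x((3, 4), (6, 8)), math.dist((3, 4), (6, 8)))
# ... while points at the same distance from the origin collapse to 0.
print(d_x((3, 4), (-3, 4)), math.dist((3, 4), (-3, 4)))
```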

For any metric $(X,d)$, we denote by $\mathcal{R}_d(X) = \sum_{u,v \in X} d(u, v)$ the sum of pairwise
distances over $X$. Let $P, Q \subseteq X$ be subsets of $X$ such that $P \cap Q = \emptyset$. We define $\mathcal{R}_d(P,Q) =
\sum_{u \in P, v \in Q} d(u, v)$ to be the sum of pairwise distances between $P$ and $Q$. The subscript $d$ will
be omitted when there is no confusion. Clearly, $\mathcal{R}(X)$ decomposes into $\mathcal{R}(P) + \mathcal{R}(Q) + \mathcal{R}(P, Q)$
when $P$ and $Q$ form a partition of $X$.
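
The decomposition of the routing cost can be checked numerically. The following sketch assumes that sums range over unordered pairs; the function names `routing_cost` and `cross_cost` and the sample points are hypothetical.

```python
from itertools import combinations


def routing_cost(X, d):
    """Sum of pairwise distances over X (each unordered pair counted once)."""
    return sum(d(u, v) for u, v in combinations(X, 2))


def cross_cost(P, Q, d):
    """Sum of distances between the disjoint sets P and Q."""
    return sum(d(u, v) for u in P for v in Q)


def line(u, v):
    return abs(u - v)


# Hypothetical point set on the real line, split into two parts.
P, Q = [1, 2], [4, 7]
total = routing_cost(P + Q, line)
parts = routing_cost(P, line) + routing_cost(Q, line) + cross_cost(P, Q, line)
print(total, parts)
```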

Consider the Euclidean space of finite dimension $d$. A hyper-rectangle is defined to be the
Cartesian product of $d$ closed intervals, which we will denote by $[a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_d, b_d]$.
Given a hyper-rectangle $R = [a_1, b_1] \times [a_2, b_2] \times \cdots \times [a_d, b_d]$, we denote by $C_i(R)$ the side length
of $R$ along the $i$-th dimension, which is $b_i - a_i$, and $C_{\max}(R) = \max_{1 \le i \le d} C_i(R)$. For a point set
$S \subseteq \mathbb{R}^d$, we define its bounding box, denoted by $B(S)$, to be the smallest hyper-rectangle that
contains $S$.

Figure 1: (a) An illustration of the centripetal metric with respect to a vertex $u$. (b) A
hierarchical decomposition of the points.

3 Approximating Arbitrary Metrics 

Given a metric M = (V,d), we describe in this section how a tree metric with small constant 
distance-weighted average stretch can be computed. 

3.1 The Algorithm 

We describe an algorithm to decompose $\mathcal{M}$ and define a hierarchical net decomposition. The
algorithm runs in $\delta = \lceil \log_2 \Delta(V) \rceil$ iterations. Initially, we have $i = \delta$ and the trivial partition
$D_\delta = \{V\}$. In each of the following iterations, we decrease the value of $i$ by one and compute
$D_i$ from $D_{i+1}$ as follows.

For each non-singleton cluster in $D_{i+1}$, say $\mathcal{P}$, we compute a $2^i$-cut decomposition $\mathcal{C}(\mathcal{P})$ of $\mathcal{P}$
by repeatedly decomposing $\mathcal{P}$ by the process described below until the diameter of each cluster
in the refinement falls below $2^i$.

Let $Q$ be a cluster in the refinement of $\mathcal{P}$ such that $\Delta(Q) \ge 2^i$. We pick a vertex $u \in Q$
such that $\Delta_u(Q) = \Delta(Q)$. Then we consider the centripetal metric of $Q$ with respect to $u$.
Let $v_1, v_2, \ldots, v_q$ be the vertices of $Q$, ordered such that $d(u, v_1) \le d(u, v_2) \le \ldots \le d(u, v_q)$. For
$1 \le i \le q-1$, we denote $\sum_{1 \le j \le i} \sum_{i < k \le q} d_u(v_j, v_k)$ by $\mathcal{RC}(i)$. Literally, $\mathcal{RC}(i)$ corresponds to the
sum of pairwise distances, or, the interaction, between $\{v_1, v_2, \ldots, v_i\}$ and $\{v_{i+1}, v_{i+2}, \ldots, v_q\}$.
Let $p$, $1 \le p < q$, be the index such that $\frac{p \cdot (q-p)}{\mathcal{RC}(p)}$ is minimized. We create a new cluster
in the refinement of $\mathcal{P}$ containing the vertices $\{v_1, v_2, \ldots, v_p\}$ and let $Q \leftarrow Q \setminus \{v_1, v_2, \ldots, v_p\}$.
This process is repeated until all the clusters in the refinement of $\mathcal{P}$ have diameter less than $2^i$.
$D_i$ is defined to be the union of the refinements of non-singleton clusters of $D_{i+1}$. A high-level
description of this algorithm can be found in the appendix.
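
A single cutting step of the process above can be sketched as follows. This is a brute-force illustration under assumptions, not the paper's implementation: the function name `best_cut` is hypothetical, the input is the list of distances $d(u, v)$ for all $v \in Q$ (with $u$ itself at distance 0), and the ratio is scaled by $\Delta(Q)$, which does not change the minimizing index.

```python
def best_cut(dist_from_u):
    """Return (p, ratio) minimizing p * (q - p) * diameter / RC(p).

    dist_from_u holds d(u, v) for every v in the cluster Q, where u attains
    the diameter, so the diameter equals the largest listed distance.
    """
    x = sorted(dist_from_u)
    q = len(x)
    diameter = x[-1]
    best = None
    for p in range(1, q):
        # RC(p): interaction between the first p vertices and the rest,
        # measured in the centripetal metric |d(u, v_j) - d(u, v_k)|.
        rc = sum(x[k] - x[j] for j in range(p) for k in range(p, q))
        if rc == 0:
            continue  # only possible if all distances coincide
        ratio = p * (q - p) * diameter / rc
        if best is None or ratio < best[1]:
            best = (p, ratio)
    return best


# Hypothetical cluster: u and three other vertices at distances 1, 2 and 10.
print(best_cut([0, 1, 2, 10]))
```

In this example the far-away vertex is split off first, since separating it gives the smallest ratio between the diameter-weighted pair count and the interaction across the cut.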

3.2 Analysis 

First we argue that the algorithm computes a dominating tree metric. Let $T$ be the tree
corresponding to the hierarchical net decomposition constructed by our algorithm and $d_T$ be
the distance function induced by $T$. For any non-singleton cluster $\mathcal{P}$ in $D_i$ and $u,v \in \mathcal{P}$, we
have $d(u,v) \le \Delta(\mathcal{P}) \le 2^i$ by the definition of hierarchical net decomposition, and $d_T(u,v) \le
2 \cdot \sum_{0 \le j < i} 2^j < 2^{i+1}$ by the construction of the tree metric. Therefore, $(T, d_T)$ is a dominating
tree metric of $\mathcal{M}$.

In the following, we will show that $\mathcal{R}(T) \le 4 \cdot \frac{210}{59} \cdot \mathcal{R}(\mathcal{M})$. To this end, we prove that, for any
partition of a cluster $Q$ into, say, $Q_1$ and $Q_2$ with $u \in Q_1$ that is performed in our algorithm,
we have

$$|Q_1| \cdot |Q_2| \cdot \Delta(Q) \le \frac{210}{59} \cdot \mathcal{R}(Q_1, Q_2). \qquad (1)$$

Let $T_Q$, $T_{Q_1}$, and $T_{Q_2}$ denote the subtrees of $T$ corresponding to $Q$, $Q_1$, and $Q_2$, respectively.
As a consequence of (1), we have $\mathcal{R}(T_{Q_1}, T_{Q_2}) \le |Q_1| \cdot |Q_2| \cdot 2^{i+1} \le 4 \cdot |Q_1| \cdot |Q_2| \cdot \Delta(Q) \le
4 \cdot \frac{210}{59} \cdot \mathcal{R}(Q_1, Q_2)$. Since $\max\{|Q_1|, |Q_2|\} < |Q|$, by an inductive argument we have $\mathcal{R}(T_Q) =
\mathcal{R}(T_{Q_1}) + \mathcal{R}(T_{Q_2}) + \mathcal{R}(T_{Q_1}, T_{Q_2}) \le 4 \cdot \frac{210}{59} \cdot (\mathcal{R}(Q_1) + \mathcal{R}(Q_2) + \mathcal{R}(Q_1, Q_2)) = 4 \cdot \frac{210}{59} \cdot \mathcal{R}(Q)$. This
holds for all clusters $Q$, including the trivial cluster in $D_\delta$. Therefore $\mathcal{R}(T) \le 4 \cdot \frac{210}{59} \cdot \mathcal{R}(\mathcal{M})$.

It remains to prove the inequality (1). Let $\{v_1, v_2, \ldots, v_q\}$ be the set of vertices of $Q$ such
that $d(u,v_1) \le d(u,v_2) \le \ldots \le d(u,v_q)$. Consider the random distribution defined
over $\beta \in \{\lceil \frac{q}{4} \rceil, \lceil \frac{q}{4} \rceil + 1, \ldots, \lfloor \frac{3q}{4} \rfloor\}$ in which $\Pr[\beta = i]$ is proportional to $\mathcal{RC}(i)$.

Let us first derive a lower bound on $\sum_{\frac{q}{4} \le i \le \frac{3q}{4}} \mathcal{RC}(i)$, which is the total amount of interaction
when cutting at the central $\frac{q}{2}$ intervals. Due to the space limit, preliminary material as well as the proofs
of the following lemmas are moved to the appendix for further reference.



Lemma 1. We have 



!<<<!« V / i<t<f 

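The following lemma proves the existence of a good cut and (1).

Lemma 2. We have

$$\min\left\{ E\left[\frac{\beta \cdot (q - \beta) \cdot \Delta(Q)}{\mathcal{RC}(\beta)}\right],\ \min_{1 \le \gamma < \frac{q}{4}} \left\{ \frac{\gamma \cdot (q - \gamma) \cdot \Delta(Q)}{\mathcal{RC}(\gamma)},\ \frac{\gamma \cdot (q - \gamma) \cdot \Delta(Q)}{\mathcal{RC}(q - \gamma)} \right\} \right\} \le \frac{210}{59}.$$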



As a side-product, we have the following lemma, which states the existence of good cuts for
any given point set and the correctness of inequality (1).

Lemma 3 (1-Dimensional Point Set Cutting Lemma). Given a set of real numbers $A =
\{a_1, a_2, \ldots, a_n\}$, $a_1 \le a_2 \le \ldots \le a_n$, there exists a cutting point $z \in \mathbb{R}$ with $a_1 \le z \le a_n$
such that the following holds.

$$L_A(z) \cdot (n - L_A(z)) \cdot \Delta \le \delta_0 \cdot \sum_{1 \le i \le L_A(z)} \sum_{L_A(z) < j \le n} (a_j - a_i),$$

where $L_A(z) = |\{a \in A : a < z\}|$ is the number of elements in $A$ that are smaller than $z$,
$\Delta = a_n - a_1$ is the diameter of $A$, and $\delta_0 \le \frac{210}{59}$ is a constant.



3.3 Lower Bound 

In the following, we derive a lower bound for Problem 1, which we considered throughout this section.
This is done by linking the basic structure of any optimal dominating tree metric to our point
set cutting lemma, followed by deriving an upper bound on the performance of any cut.

Let $A = \{a_1, a_2, \ldots, a_n\}$ be a set of numbers, where $a_i = i$ for all $1 \le i \le n$, and $(A, d)$ be the
corresponding metric extracted from $A$. Let $(T, d_T)$ be an optimal ultra-metric embedding of $A$
in terms of distance-weighted average stretch. Without loss of generality, we can assume that
$T$ is a binary tree. Otherwise, we can always create dummy nodes to make $T$ binary without
changing its sum of pairwise distances. The following lemma characterizes the structure of $T$.

Lemma 4. Let $T_L$ and $T_R$ be the left-subtree and the right-subtree of $T$ such that $a_1 \in
T_L$. Then, there exists an integer $k$, $1 \le k < n$, such that $T_L$ is an ultra-metric containing
$\{a_1, a_2, \ldots, a_k\}$ and $T_R$ is an ultra-metric containing $A \setminus \{a_1, a_2, \ldots, a_k\}$.

Therefore, to obtain a lower bound on the distance-weighted average stretch of any domi- 
nating tree metric of A, it suffices to consider the quality of the best cut we can possibly achieve 
on A. 

Lemma 5. Let $\delta_0$ be a constant such that our point set cutting lemma holds. Then $\delta_0 \ge 2$.
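
The role of the evenly spaced set $a_i = i$ can be illustrated numerically. The sketch below is only a sanity check and not the paper's proof; the function name `cut_ratios` is a hypothetical helper. It evaluates, for every cut position, the ratio between the left-hand side (without $\delta_0$) and the double sum in the point set cutting lemma, using exact rational arithmetic. For this point set every cut yields the same ratio $2(n-1)/n$, which approaches 2 as $n$ grows.

```python
from fractions import Fraction


def cut_ratios(a):
    """For each cut size L, return L * (n - L) * diameter / interaction."""
    a = sorted(a)
    n = len(a)
    diameter = a[-1] - a[0]
    ratios = []
    for L in range(1, n):
        # Interaction between the L smallest elements and the remaining ones.
        interaction = sum(a[j] - a[i] for i in range(L) for j in range(L, n))
        ratios.append(Fraction(L * (n - L) * diameter, interaction))
    return ratios


n = 50
ratios = cut_ratios(range(1, n + 1))
print(min(ratios), max(ratios))
```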

By Lemma 4 and Lemma 5, we obtain the following bound as claimed.

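Corollary 6. There exists a metric $\mathcal{M} = (V,d)$ such that, with $\mathcal{D}(\mathcal{M})$ denoting the set of
dominating tree metrics of $\mathcal{M}$,

$$\inf_{(V',d') \in \mathcal{D}(\mathcal{M})} \frac{\sum_{u,v \in V} d'(u,v)}{\sum_{u,v \in V} d(u,v)} \ge 2.$$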

4 Approximating Euclidean Metrics by Their Spanning Trees 

In this section, we show how a spanning tree of small constant distance-weighted average stretch
for a Euclidean graph can be computed in polynomial time. The basic idea is to extend the
previous point-set decomposition. In order to guarantee a constant blow-up in the diameter of
the resulting spanning tree, we cannot allow the cut to be made at arbitrary positions. Instead,
we restrict each cut to be made within the central $(1 - 2\alpha)$ portion along the longest side of its
bounding box, where $\alpha$ is a suitably chosen constant. This guarantees a balanced partition, an
exponentially decreasing size of the bounding boxes, and a constant blow-up of the diameter of
the resulting spanning tree. This is crucial in the analysis, as we need a tight diameter in order
to provide a good upper bound on the interaction between pairs separated by our cuts. On the
other hand, we also have to guarantee the existence of good cuts in the central $(1 - 2\alpha)$ portion
so that the overall interaction stays bounded.

Given a set of points $V$ in the Euclidean space $\mathbb{R}^d$ of finite dimension, our algorithm
recursively computes a rooted tree $T$ with root $r$ as follows. Let $B(V)$ be the bounding box of $V$, and
$k$ be the index of the dimension such that $C_k(B(V)) = C_{\max}(B(V))$. We consider the projection
of the points to the $k$-th axis, and let $a_1, a_2, \ldots, a_n$, $a_1 \le a_2 \le \ldots \le a_n$, be the corresponding
coordinates. We apply our linear time algorithm$^1$ to compute a decomposition for which the cut
is restricted to be made inside the central $(1 - 2\alpha)$ portion, $[\alpha \cdot (a_1 + a_n), (1 - \alpha) \cdot (a_1 + a_n)]$.
See also Fig. 2(a). Let $V_1$ and $V_2$ be the corresponding partitioned subsets of points. We
compute recursively the two rooted trees for $V_1$ and $V_2$, denoted by $T_1$ with root $r_1$ and $T_2$ with
root $r_2$. The tree $T$ is constructed by joining $r_1$ and $r_2$, and the root of $T$ is chosen to be $r_1$.
A high-level description of our algorithm is provided in the appendix.
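
The recursive construction can be sketched as below. This is a simplified illustration under several assumptions and not the paper's algorithm: the points are distinct tuples, the default value `alpha=0.25` is a placeholder rather than the paper's constant, the central portion is taken relative to the smallest coordinate (from $a_1 + \alpha(a_n - a_1)$ to $a_n - \alpha(a_n - a_1)$), and the cut is found by brute force over candidate positions instead of the linear-time procedure of §A.4. The function name `build_tree` is hypothetical.

```python
def build_tree(points, alpha=0.25):
    """Return (root, edges) of a spanning tree built by restricted cuts.

    points: list of distinct coordinate tuples. alpha must lie in (0, 0.5).
    """
    assert 0 < alpha < 0.5
    if len(points) == 1:
        return points[0], []
    dims = range(len(points[0]))
    lo = [min(p[i] for p in points) for i in dims]
    hi = [max(p[i] for p in points) for i in dims]
    # Longest side of the bounding box.
    k = max(dims, key=lambda i: hi[i] - lo[i])
    side = hi[k] - lo[k]
    left, right = lo[k] + alpha * side, hi[k] - alpha * side
    coords = sorted(p[k] for p in points)
    n = len(coords)
    # Candidate cut positions inside the central (1 - 2 * alpha) portion.
    candidates = {left, right} | {c for c in coords if left <= c <= right}
    best = None
    for z in sorted(candidates):
        L = sum(1 for c in coords if c < z)
        interaction = sum(b - a for a in coords[:L] for b in coords[L:])
        ratio = L * (n - L) * side / interaction
        if best is None or ratio < best[0]:
            best = (ratio, z)
    z = best[1]
    v1 = [p for p in points if p[k] < z]
    v2 = [p for p in points if p[k] >= z]
    r1, e1 = build_tree(v1, alpha)
    r2, e2 = build_tree(v2, alpha)
    # Join the two roots; the root of the combined tree is r1.
    return r1, e1 + e2 + [(r1, r2)]


points = [(0, 0), (1, 0), (5, 1), (6, 3), (2, 7)]
root, edges = build_tree(points)
print(root, edges)
```

Because every cut lies strictly inside the bounding box along its longest side, both parts are non-empty and the recursion terminates with exactly $n - 1$ edges.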


Figure 2: (a) The vertical cut is restricted to be placed in the central $(1 - 2\alpha)$ portion along
the longest side of the bounding box. (b) A possible decomposition and the $u$-$v$ path in the
resulting tree.


$^1$ This algorithm is moved to §A.4 for further reference due to the space limit.

In the following lemma, we show that, in exchange for a certain penalty in the performance
factor that is inversely proportional to the length of the interval to which the cut is restricted,
we can always guarantee a good and balanced decomposition.

Lemma 7 (Constrained Point Set Cutting Lemma). Given a set of real numbers $A = \{a_1, a_2, \ldots, a_n\}$,
$a_1 \le a_2 \le \ldots \le a_n$, and an interval $\mathcal{I} = [\ell, r]$ such that $\mathcal{I} \subseteq [a_1, a_n]$, there exists a cutting point
$z \in \mathcal{I}$ such that the following holds.



where $L_A(z) = |\{a \in A : a < z\}|$ is the number of elements in $A$ that are smaller than $z$ and
$\delta_0 \le \frac{210}{59}$ is a constant.

In the following, we state the theorem and leave the remaining details to the appendix for further
reference.

Theorem 8. Given a set of points $V$ in $\mathbb{R}^d$, we can compute in polynomial time a spanning
tree $T$ of $V$ such that the distance-weighted average stretch of $T$ with respect to $V$ is at most
$16\delta_0 \cdot d\sqrt{d}$, where $\delta_0 \le \frac{210}{59}$ is the constant in our point set cutting lemma.

5 Discussion and Open Problems 

We conclude with some remarks and conjectures. In this work, we provided both an upper
bound and a lower bound for Problem 1. We conjecture that the lower bound of two we provided
is tight. On the other hand, we also conjecture that a similar result holds for approximating
arbitrary graph metrics by their spanning trees. However, as it does not seem promising to
guarantee the quality of the best cut for arbitrarily small restricted intervals, none of the known
graph decomposition techniques helps, and either more powerful decomposition schemes or new
techniques are needed.

A Approximating Arbitrary Metrics

A.1 The Algorithm



Algorithm Hierarchical-Net-Decomposition($V$, $d$)
  $D_\delta \leftarrow \{V\}$, $i \leftarrow \delta - 1$.
  while $i \ge 0$ and $D_{i+1}$ has non-singleton clusters do
    for all non-singleton clusters $\mathcal{P}$ in $D_{i+1}$ do
      $\mathcal{C}(\mathcal{P}) \leftarrow \emptyset$, $S \leftarrow \{\mathcal{P}\}$.
      while $S \ne \emptyset$ do
        Let $Q$ be an arbitrary cluster in $S$.
        if $\Delta(Q) < 2^i$ then
          Add $Q$ to $\mathcal{C}(\mathcal{P})$ and remove $Q$ from $S$.
        else
          Let $u \in Q$ be a vertex such that $\Delta_u(Q) = \Delta(Q)$.
          Let $v_1, v_2, \ldots, v_q$ be the vertices in $Q$ such that $d(u, v_1) \le d(u, v_2) \le \ldots \le d(u, v_q)$.
          Let $p$, $1 \le p < q$, be the index such that $\frac{p \cdot (q-p)}{\mathcal{RC}(p)}$ is minimized.
          Let $Q' \leftarrow \{v_1, v_2, \ldots, v_p\}$, $S \leftarrow S \cup \{Q'\}$, and $Q \leftarrow Q \setminus Q'$.
        end if
      end while
      Let $\mathcal{C}(\mathcal{P})$ be the refinement clusters of $\mathcal{P}$ in $D_i$.
    end for
    $i \leftarrow i - 1$.
  end while
  Return the tree metric corresponding to $D_0, D_1, \ldots, D_\delta$.

Figure 3: A high-level description of the algorithm.

A.2 Analysis

Lemma 1. We have

£ ^(0 >(!*"+§■ E *)• £ 

!<*<!? V §<?<*<! / |<fc<f 

Before proving Lemma 1, let us derive a lower bound on the overall interaction $\sum_{1 \le i < q} \mathcal{RC}(i)$. Recall
that $\mathcal{RC}(i) = \sum_{1 \le j \le i} \sum_{i < k \le q} d_u(v_j, v_k)$ and $d_u(v_j, v_k) = |d(u, v_j) - d(u, v_k)|$. For convenience, we will
denote by $\ell_k$ the quantity $d_u(v_k, v_{k+1})$, which is exactly $d(u, v_{k+1}) - d(u, v_k)$, for each $1 \le k < q$.

First, observe that, for each $j, k$ with $1 \le j < k \le q$, we have exactly $(k - j)$ duplications of the item
$d_u(v_j, v_k)$ in the summation $\sum_{1 \le i < q} \mathcal{RC}(i)$, i.e., it appears exactly once in $\mathcal{RC}(i)$ for each $j \le i < k$.
Therefore, after re-arranging the items we have

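$$\sum_{1 \le i < q} \mathcal{RC}(i) = \sum_{1 \le k < q} k \cdot \sum_{1 \le i \le q-k} d_u(v_i, v_{i+k}).$$

Let $f(q) = \frac{q}{2} \cdot \sum_{1 \le i \le \frac{q}{2}} d_u(v_i, v_{i + \frac{q}{2}})$ if $q$ is even and $f(q) = 0$ otherwise. Then

$$\sum_{1 \le k < q} k \cdot \sum_{1 \le i \le q-k} d_u(v_i, v_{i+k})
= \sum_{1 \le k < \frac{q}{2}} k \cdot \sum_{1 \le i \le q-k} d_u(v_i, v_{i+k}) + \sum_{\frac{q}{2} < k < q} k \cdot \sum_{1 \le i \le q-k} d_u(v_i, v_{i+k}) + f(q)$$

$$= \sum_{1 \le k < \frac{q}{2}} k \cdot \sum_{1 \le i \le q-k} d_u(v_i, v_{i+k}) + \sum_{1 \le k < \frac{q}{2}} (q - k) \cdot \sum_{1 \le i \le k} d_u(v_i, v_{i+q-k}) + f(q),$$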

Figure 4: Alignment of the intervals when $k = 3$. The first group starts with $d(u, v_1)$ while the
second and the third start with $d(u, v_2)$ and $d(u, v_3)$, respectively.

where in the last equality we substitute the variable $k$ by $q - k$. By re-organizing and aligning the
items from the above summation (see also Fig. 4), we have the following lemma.

Lemma 9. For $1 \le k \le \lfloor \frac{q}{2} \rfloor$, we have

$$\sum_{1 \le i \le q-k} d_u(v_i, v_{i+k}) = k \cdot \Delta(Q) - \sum_{1 \le i < k} (k - i) \cdot (\ell_i + \ell_{q-i}) = \sum_{1 \le i \le k} d_u(v_i, v_{i+q-k}).$$

Proof of Lemma 9. We prove the first half of this lemma, $\sum_{1 \le i \le q-k} d_u(v_{i+k}, v_i) = k \cdot \Delta(Q) - \sum_{1 \le i < k} (k - i) \cdot (\ell_i + \ell_{q-i})$.
The second half, $\sum_{1 \le i \le k} d_u(v_{i+q-k}, v_i) = k \cdot \Delta(Q) - \sum_{1 \le i < k} (k - i) \cdot (\ell_i + \ell_{q-i})$, follows
by a similar argument. Consider the alignments of the set of intervals which span exactly $k$ consecutive
elements, that is, intervals $[d(u, v_i), d(u, v_{i+k})]$, for $1 \le k \le \lfloor \frac{q}{2} \rfloor$. We have exactly $k$ alignments, each
starting with $v_i$ for $1 \le i \le k$. See also Fig. 4. This sums up to $k \cdot \Delta(Q)$, except for an over-count of
exactly $k - i$ times for $\ell_i$ and $\ell_{q-i}$. □

We provide in the following lemma an estimate of the overall interaction, $\sum_{1 \le i < q} \mathcal{RC}(i)$.

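Lemma 10.

$$\sum_{1 \le i < q} \mathcal{RC}(i) \ge \sum_{1 \le k < \frac{q}{2}} q \cdot \sum_{\frac{q}{2} - k \le i < \frac{q}{2}} i \cdot (\ell_k + \ell_{q-k}) + g(q),$$

where $g(q) = q \cdot \sum_{1 \le i < \frac{q}{2}} i \cdot \ell_{\frac{q}{2}}$ if $q$ is even and $g(q) = 0$ otherwise.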
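Proof of Lemma 10. By the above discussion and Lemma 9, we have

$$\sum_{1 \le i < q} \mathcal{RC}(i)
= \sum_{1 \le k < \frac{q}{2}} k \cdot \sum_{1 \le i \le q-k} d_u(v_i, v_{i+k}) + \sum_{1 \le k < \frac{q}{2}} (q - k) \cdot \sum_{1 \le i \le k} d_u(v_i, v_{i+q-k}) + f(q)$$

$$= \sum_{1 \le k < \frac{q}{2}} q \cdot \Big( k \cdot \Delta(Q) - \sum_{1 \le i < k} (k - i)(\ell_i + \ell_{q-i}) \Big) + f(q).$$

For $1 \le i < \frac{q}{2}$, the coefficient of $\ell_i$ and $\ell_{q-i}$ in the above summation is $q \cdot \sum_{i < k < \frac{q}{2}} (k - i)$, which equals
$q \cdot \sum_{1 \le k < \frac{q}{2} - i} k$ by substituting the variable $k$ by $k - i$. Therefore, we have

$$\sum_{1 \le i < q} \mathcal{RC}(i) \ge \sum_{1 \le k < \frac{q}{2}} q \cdot k \cdot \Delta(Q) - q \cdot \sum_{1 \le k < \frac{q}{2}} \sum_{1 \le i < \frac{q}{2} - k} i \cdot (\ell_k + \ell_{q-k}).$$

Since $\Delta(Q) = \sum_{1 \le i < q} \ell_i$, further expanding $\Delta(Q)$, we obtain

$$\sum_{1 \le i < q} \mathcal{RC}(i) \ge \sum_{1 \le k < \frac{q}{2}} q \cdot \sum_{\frac{q}{2} - k \le i < \frac{q}{2}} i \cdot (\ell_k + \ell_{q-k}) + g(q).$$

□

Now we are ready to prove Lemma 1.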

Proof of Lemma 1. We divide the total interaction to be lower-bounded, $\sum_{\frac{q}{4} \le i \le \frac{3q}{4}} \mathcal{RC}(i)$, into three
parts which we discuss below.






I. the interaction between points from 7 w p~|+ij ■ ■ ■ > u |5£j |- 



The situation is equivalent to computing the overall interaction for a point set of | points. By 
Lemma 10 with index replacement, the interaction is lower-bounded by Ek/c<2 2 ' Sa-/c<i<2 * ' 
(^| + fc+«32_ fc )+g'(g), where g'(q) = § 'Ei<i<2 *'^| ^ I ^ s even an< ^ d'il) = ® otherwise. Dropping 
the items corresponding to k < ^ from the first summation, we obtain | -J2^q<i<^ *'X)i<fe<?s 

For the remaining two cases, we consider the number of times each of the items from Ei<fc<2i ^fe 

contributes to Y\i^i^ 3 i HC(i). 

1 4 — 4 

II. the interaction between |i>i, 1*2, . . . , Urgi | and jui sjj , V\ zg.\ +1 , ■ ■ ■ , 

For each j, k such that 1 < j < §, ^ < A; < the pair d u (vj,Vk) contributes exactly once to the 
term lZC(i) for each i with | < i < There are such pairs, while there are | different terms 
in the final summation Ea<i<2 9 7ZC(i). Therefore, we obtain a lower bound of ^q 3 • E«<fc<22 ^ 
for this part. 

III. the interaction between { w [~2l 1 u rsl-|_i> • • • > ^1 2aJ } and other points. 

For any specific interval £ p with | < p < we consider the number of pairs between { w > +i> • ■ • ' w [5sJ 
and other points that contain this specific interval £ p . There arep— | points, j w |~2~| + i j • • ■ 1 v pp 

which lie to the left of v p and form pairs with points from 391 , v\ ■ ■ ■ that contain 

£ p . Similarly, the — p points that lie to the right of v p also form pairs with points from 
V2, ■ ■ ■ , j that contain £ p . Therefore there are | • (p — | + ^ — p) = | • | such pairs. 

This is true for all lZC(i) with | < i < Therefore £ p contributes | -| • | times in the summation 
and we obtain a lower bound of jg? 3 ■ ^2i<k<^ 

Summing up the bounds obtained in the three parts proves the lemma. □



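Lemma 2. We have

$$\min\left\{ E\left[\frac{\beta \cdot (q - \beta) \cdot \Delta(Q)}{\mathcal{RC}(\beta)}\right],\ \min_{1 \le \gamma < \frac{q}{4}} \left\{ \frac{\gamma \cdot (q - \gamma) \cdot \Delta(Q)}{\mathcal{RC}(\gamma)},\ \frac{\gamma \cdot (q - \gamma) \cdot \Delta(Q)}{\mathcal{RC}(q - \gamma)} \right\} \right\} \le \frac{210}{59}.$$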



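Proof of Lemma 2. This lemma holds trivially when $q \le 3$. For $q \ge 4$, by the definition of expected
values, we have

$$E\left[\frac{\beta \cdot (q - \beta) \cdot \Delta(Q)}{\mathcal{RC}(\beta)}\right] = \sum_{\frac{q}{4} \le i \le \frac{3q}{4}} \Pr[\beta = i] \cdot \frac{i \cdot (q - i) \cdot \Delta(Q)}{\mathcal{RC}(i)} = \frac{\sum_{\frac{q}{4} \le i \le \frac{3q}{4}} i \cdot (q - i) \cdot \Delta(Q)}{\sum_{\frac{q}{4} \le i \le \frac{3q}{4}} \mathcal{RC}(i)}.$$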



First we have 



E i(g-i)-A(Q)= u- E E 



f<i<: 



2<J< ; 



A(Q) < ^9 3 A(Q). 



f<tO 



Depending on whether or not E«<fc<?« ^£ ^ §gA(Q)j we distinguish between two cases. 
If X]s<k?s ^« ^ if A(Q), then, by Lemmalll we have 



E KC(i)> E + l E *] >^(Q) 
i<fc<?2



32 



35 



59 
96-6 
and E



P ■ (q - P) ■ A(Q) 



11 3 a / A 1 a 59 ,\ 210 

^96 9 A(2)/ U A(Q) '^6 9 J* ar 



On the other hand, if Y^i<i<i{^i + ^?-*) — 35^(2), then we have either Xa<i<§ &i — 35^(2), or 
Z)i<t<f - M^(2)- Without loss of generality, assume that J2i<i<z &i > Si<;<§ ^g-» - M A (2). 



>§A(Q) . <JA(Q) 



In this case, we have X)i<i<| ^ + X)§<»<^^» ^ Z)^<»< g ^»- Therefore £)^< i<g ^j < ^2 • Lct 
p be the smallest integer such that £ p > 0. Counting the interaction between {v\,V2, ■ ■ ■ ,v p } and 
{v p+ i,v p+2 ,...,v q }, we have TZC{p) >p- § • |§A(2) + p- § • |A(Q). Therefore, 

P'fa-P)-A(6) P-g- A(Q) = 210 

fcC7(p) - p . ? . A (Q)-(i-i| + i-|) 59- 

The argument for the case X)i<i<2 ^g-i > X)i<i<2 is analogous. This proves the lemma. □ □ 
A.3 Lower Bound

Let $A = \{a_1, a_2, \ldots, a_n\}$ be a set of numbers, where $a_i = i$ for all $1 \le i \le n$, and let $(A, d)$ be the
corresponding metric extracted from $A$. Let $(T, d_T)$ be an optimal ultra-metric embedding of $A$ in terms
of distance-weighted average stretch. Without loss of generality, we can assume that $T$ is a binary tree.
Otherwise, we can always create dummy nodes to make $T$ binary without changing its sum of pairwise
distances. The following lemma characterizes the structure of $T$.

Lemma 4. Let $T_L$ and $T_R$ be the left-subtree and the right-subtree of $T$ such that $a_1 \in T_L$. Then, there
exists an integer $k$, $1 \le k < n$, such that $T_L$ is an ultra-metric containing $\{a_1, a_2, \ldots, a_k\}$ and $T_R$ is an
ultra-metric containing $A \setminus \{a_1, a_2, \ldots, a_k\}$.

Proof of Lemma 4. If not, let $\ell$ be the number of leaves in $T_L$, and denote by $\varphi$ the permutation on
$\{1, 2, \ldots, n\}$ such that $T_L$ is an ultra-metric containing $\{a_{\varphi(1)}, a_{\varphi(2)}, \ldots, a_{\varphi(\ell)}\}$, where
$a_{\varphi(1)} < a_{\varphi(2)} < \cdots < a_{\varphi(\ell)}$, and $T_R$ is an ultra-metric containing $\{a_{\varphi(\ell+1)}, a_{\varphi(\ell+2)}, \ldots, a_{\varphi(n)}\}$, where
$a_{\varphi(\ell+1)} < a_{\varphi(\ell+2)} < \cdots < a_{\varphi(n)}$. Note that by our assumption, $a_{\varphi(\ell)} > a_{\varphi(\ell+1)}$.

Construct a new ultra-metric $T_0$ as follows. The structure of $T_0$ is identical to $T$. For each leaf node
in $T$ that contains the singleton element, say $a_{\varphi(u)}$, we put the element $a_u$ in the corresponding leaf
node of $T_0$. The label of each internal node in $T_0$ is set to be the diameter of the set of elements contained
in the subtree rooted at it.

For each $i, j$ with $1 \le i < j \le \ell$ or $\ell < i < j \le n$, since $i < j$ implies $a_{\varphi(i)} < a_{\varphi(j)}$ by the definition
of $\varphi$, we have $a_{\varphi(j)} - a_{\varphi(i)} \ge j - i$. Therefore the label of each internal node in $T_0$ is no larger than
that of the corresponding internal node in $T$. Furthermore, since $a_{\varphi(\ell)} > a_{\varphi(\ell+1)}$ by assumption, we
have $a_\ell - a_1 < a_{\varphi(\ell)} - a_{\varphi(1)}$ and $a_n - a_{\ell+1} < a_{\varphi(n)} - a_{\varphi(\ell+1)}$. Therefore, the labels of the roots of the
left-subtree and the right-subtree of $T_0$ are strictly smaller than the labels of their corresponding nodes
in $T$. Hence we can conclude that $\mathcal{R}(T_0) < \mathcal{R}(T)$, which contradicts the optimality of $T$. □

Lemma 5. Let $\delta_0$ be a constant such that our point set cutting lemma holds. Then $\delta_0 \ge 2$.

Proof of Lemma 5. Consider the set of numbers $A$. Assume that we cut $A$ at a point $z \in (a_k, a_{k+1}]$, for
some $1 \le k < n$. The left-hand side of the inequality in our cutting lemma is $k \cdot (n - k) \cdot (n - 1)$, while the
right-hand side is $\delta_0 \cdot \sum_{1 \le i \le k} \sum_{k < j \le n} (j - i) = \delta_0 \cdot \frac{1}{2} k n (n - k)$, where the equality follows from Equation (2)
derived in §A.4. Therefore we have

$$\delta_0 \ge \frac{k(n - k)(n - 1)}{\frac{1}{2} k n (n - k)} = \frac{2(n - 1)}{n},$$

which converges to 2 as $n$ tends to infinity. Since this is true for all $k$ with $1 \le k < n$, this lemma follows. □
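The closed form used in this proof is easy to check numerically. The following is a minimal sketch, assuming the points $a_i = i$ and the whole interval $[a_1, a_n]$ of length $n - 1$; the function names are illustrative and not part of the original text.

```python
def cut_cost(n: int, k: int) -> int:
    """Sum of (j - i) over 1 <= i <= k < j <= n, for the points a_i = i."""
    return sum(j - i for i in range(1, k + 1) for j in range(k + 1, n + 1))


def cutting_ratio(n: int, k: int) -> float:
    """k (n - k) (n - 1) divided by the cut cost: the smallest constant
    that makes the cutting inequality hold for this particular cut."""
    return k * (n - k) * (n - 1) / cut_cost(n, k)


if __name__ == "__main__":
    for n in (4, 16, 64, 256):
        ratios = [cutting_ratio(n, k) for k in range(1, n)]
        # Every cut gives (up to rounding) the same ratio 2 (n - 1) / n.
        print(n, min(ratios), max(ratios), 2 * (n - 1) / n)
```

For every $k$ the ratio is the same, so no choice of cut on this point set can do better than $2(n-1)/n$.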
Corollary 6. Let $M = (V, d)$ be a given metric and $\mathcal{D}(M)$ be the set of dominating tree metrics of $M$.
Then 

E^ T , d'(u, v) 

mt v^" Ti \~ ^ 2 - 

(V'.d')eV(M) Y,u,vtV d \ U ' V ) 

Proof of Corollary 6. This corollary follows directly from Lemma 4, Lemma 5, and induction on the size
of $A$. □

A.4 Computing the Optimal Cut in Linear Time

In this section, we show how the best cut can be computed efficiently in linear time. Let $\{a_1, a_2, \ldots, a_n\}$,
$a_1 \le a_2 \le \cdots \le a_n$, be the given set of points. For each $k$ with $1 \le k \le n$, let $LS(k) = \sum_{1 \le i < k} (a_k - a_i)$
and $RS(k) = \sum_{k < i \le n} (a_i - a_k)$ be the sum of the distances between $a_k$ and the points to the left of $a_k$
and the sum of the distances between $a_k$ and the points to the right of $a_k$, respectively. The first observation
is that, for $1 \le i < n$,

$$\mathcal{RC}(i) = (n - i) \cdot LS(i) + i \cdot RS(i). \qquad (2)$$

The following lemma shows how these quantities can be computed recursively.

Lemma 11. For $1 \le k \le n - 1$, we have
• $LS(k + 1) = LS(k) + \sum_{1 \le i \le k} \ell_k$, and
• $RS(k + 1) = RS(k) - \sum_{k < i \le n} \ell_k$.

Proof of Lemma 11. By definition, we have $LS(k + 1) = \sum_{1 \le i < k+1} (\ell_k + a_k - a_i) = LS(k) + \sum_{1 \le i \le k} \ell_k$,
and $RS(k + 1) = \sum_{k+1 < i \le n} (a_i - a_k - \ell_k) = RS(k) - \sum_{k < i \le n} \ell_k$. □

By Lemma 11 and Equation (2), we can compute in linear time the values $LS(k)$, $RS(k)$, $\mathcal{RC}(k)$ for all
$1 \le k \le n$, and the optimal cut. For any given interval $\mathcal{I} \subseteq [a_1, a_n]$, we can also compute the optimal
cut inside $\mathcal{I}$ by the same approach.
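The recurrences above translate into a single pass over the sorted points. The following is a minimal sketch under two assumptions that the surrounding text does not state explicitly: the gap is taken to be $\ell_k = a_{k+1} - a_k$ (as the proof of Lemma 11 suggests), and the "best" cut is taken to be the index $k$ minimising $k(n-k)/\mathcal{RC}(k)$, which is the quantity the cutting lemma bounds. The names `cut_costs` and `best_cut` are illustrative.

```python
from typing import List, Sequence, Tuple


def cut_costs(a: Sequence[float]) -> Tuple[List[float], List[float], List[float]]:
    """Return (LS, RS, RC) as 1-indexed lists (index 0 unused) for sorted a."""
    n = len(a)
    ls = [0] * (n + 1)
    rs = [0] * (n + 1)
    rc = [0] * (n + 1)
    if n == 0:
        return ls, rs, rc
    # Base case: LS(1) = 0 and RS(1) = sum over i > 1 of (a_i - a_1).
    rs[1] = sum(x - a[0] for x in a)
    for k in range(1, n):
        gap = a[k] - a[k - 1]              # l_k = a_{k+1} - a_k
        ls[k + 1] = ls[k] + k * gap        # k points move l_k further away
        rs[k + 1] = rs[k] - (n - k) * gap  # n - k points move l_k closer
    for k in range(1, n + 1):
        rc[k] = (n - k) * ls[k] + k * rs[k]  # Equation (2)
    return ls, rs, rc


def best_cut(a: Sequence[float]) -> int:
    """Index k (1 <= k < n) minimising k (n - k) / RC(k).

    Assumes len(a) >= 2 and a[0] < a[-1], so that RC(k) > 0 for every k.
    """
    n = len(a)
    _, _, rc = cut_costs(a)
    candidates = [k for k in range(1, n) if rc[k] > 0]
    return min(candidates, key=lambda k: k * (n - k) / rc[k])


if __name__ == "__main__":
    points = [0, 1, 1, 4, 9, 10]
    print(cut_costs(points)[2][1:], best_cut(points))
```

Each of the three loops is linear in $n$, so the total work is $O(n)$ once the points are sorted.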

B Approximating Euclidean Metrics by Their Spanning Trees 

Algorithm Euclidean-Spanning-Tree(V)
Input: A set $V$ of $n$ points in $\mathbb{R}^d$.
Output: A pair $(T, r)$, which is a spanning tree $T$ of $V$ with root $r$.

if $V$ is a singleton point set containing point $p$ then
    Return $(V, p)$.
end if
Let $\alpha = \frac{1}{4}$ be a constant.
Let $k$ be the index of dimension such that $\mathcal{L}_k(B(V)) = \mathcal{L}_{\max}(B(V))$.
Let $a_1 \le a_2 \le \cdots \le a_n$ be the coordinates of the projection of $V$ into the $k$-th dimension, labelled
    in sorted order.
$p = a_1 + \alpha \cdot (a_n - a_1)$, $q = a_n - \alpha \cdot (a_n - a_1)$.
$(V_1, V_2) \leftarrow$ 1d-cut$(\{a_1, a_2, \ldots, a_n\}, [p, q])$.
$(T_1, r_1) \leftarrow$ Euclidean-Spanning-Tree$(V_1)$, $(T_2, r_2) \leftarrow$ Euclidean-Spanning-Tree$(V_2)$.
Let $T \leftarrow T_1 \cup T_2 \cup \{(r_1, r_2)\}$.
Return $(T, r_1)$.

Figure 5: Algorithm for computing a spanning tree of low routing cost on Euclidean graphs.
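The recursion in Figure 5 can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' implementation: `one_d_cut` is a hypothetical stand-in for the 1d-cut subroutine (it clamps the coordinates to the interval and picks the cut minimising $k(n-k)/\mathcal{RC}(k)$), the restricted interval is taken to be the middle $(1 - 2\alpha)$ fraction of $[a_1, a_n]$, and the input points are assumed to be pairwise distinct.

```python
from bisect import bisect_left
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]
Edge = Tuple[Point, Point]

ALPHA = 0.25  # the constant alpha (assumed to be 1/4)


def one_d_cut(coords: Sequence[float], lo: float, hi: float) -> int:
    """Hypothetical 1d-cut: return k so that the first k sorted coordinates
    go to the left part, choosing among cuts that fall inside [lo, hi]."""
    n = len(coords)
    clamped = [min(max(x, lo), hi) for x in coords]
    prefix = [0.0]
    for x in clamped:
        prefix.append(prefix[-1] + x)
    total = prefix[-1]
    k_min = max(1, bisect_left(coords, lo))      # coordinates smaller than lo
    k_max = min(n - 1, bisect_left(coords, hi))  # coordinates smaller than hi
    best_k, best_ratio = k_min, float("inf")
    for k in range(k_min, k_max + 1):
        # Cut cost of the clamped points: sum of (b_j - b_i) over i <= k < j.
        rc = k * (total - prefix[k]) - (n - k) * prefix[k]
        if rc > 0 and k * (n - k) / rc < best_ratio:
            best_k, best_ratio = k, k * (n - k) / rc
    return best_k


def euclidean_spanning_tree(points: List[Point]) -> Tuple[List[Edge], Point]:
    """Return (edges, root) of a spanning tree of the given points."""
    if len(points) == 1:
        return [], points[0]
    d = len(points[0])
    # Dimension along which the bounding box is longest.
    axis = max(range(d),
               key=lambda t: max(p[t] for p in points) - min(p[t] for p in points))
    pts = sorted(points, key=lambda p: p[axis])
    coords = [p[axis] for p in pts]
    span = coords[-1] - coords[0]
    k = one_d_cut(coords, coords[0] + ALPHA * span, coords[-1] - ALPHA * span)
    edges1, r1 = euclidean_spanning_tree(pts[:k])
    edges2, r2 = euclidean_spanning_tree(pts[k:])
    return edges1 + edges2 + [(r1, r2)], r1


if __name__ == "__main__":
    tree, root = euclidean_spanning_tree([(0.0, 0.0), (1.0, 0.2), (2.0, 0.1), (3.0, 0.9)])
    print(root, tree)
```

Both parts of every cut are non-empty, so the recursion depth is at most the number of points; for large inputs an explicit stack would avoid Python's recursion limit.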

For convenience, let $\mathcal{F}$ be the collection of subsets of $V$ which have occurred during the recursions.
For any $Q \in \mathcal{F}$, we denote by $T[Q]$ the subtree of $T$ corresponding to $Q$ and by $e(Q)$ the edge connecting
the two rooted subtrees corresponding to the two further partitions of $Q$. $e(Q)$ is defined to be a dummy
self-loop with length zero if $Q$ is a singleton set. The following lemma provides an upper bound on the
pairwise distances.

Lemma 12. For any $p, q \in V$, we have $d_T(p, q) \le \frac{2}{\alpha} d\sqrt{d} \cdot \mathcal{L}_{\max}(B(V))$.

Proof of Lemma 12. Let $A_1 \supset A_2 \supset \cdots \supset A_a$, $A_i \in \mathcal{F}$ for $1 \le i \le a$, be the subsets of $V$ that occurred during
the recursions to which $p$ belongs, and $B_1 \supset B_2 \supset \cdots \supset B_b$, $B_j \in \mathcal{F}$ for $1 \le j \le b$, be the subsets to
which $q$ belongs. Note that $A_1 = B_1 = V$, $A_a = \{p\}$, and $B_b = \{q\}$. From the construction of $T$, we
have

$$d_T(p, q) \le d_{T[A_1]}(p, r_1) + |e(V)| + d_{T[B_1]}(r_2, q) \le \sum_{1 \le i \le a} |e(A_i)| + |e(V)| + \sum_{1 \le j \le b} |e(B_j)|,$$

where $r_1$ and $r_2$ are the roots of $T[A_1]$ and $T[B_1]$. Since the longest straight-line distance inside a
hyper-rectangle is bounded by its longest diagonal, we have $|e(Q)| \le \sqrt{d}\, \mathcal{L}_{\max}(B(Q))$ for any subset $Q \in \mathcal{F}$.
Furthermore, since we always cut along the longest side of the bounding box, we have $\mathcal{L}_{\max}(B(A_{i+d})) \le
(1 - \alpha)\, \mathcal{L}_{\max}(B(A_i))$ and $\mathcal{L}_{\max}(B(B_{j+d})) \le (1 - \alpha)\, \mathcal{L}_{\max}(B(B_j))$ for all $1 \le i \le a - d$ and $1 \le j \le b - d$.
Therefore, it follows that

$$d_T(p, q) \le \sum_{1 \le i \le a} \sqrt{d}\, \mathcal{L}_{\max}(B(A_i)) + \sqrt{d}\, \mathcal{L}_{\max}(B(V)) + \sum_{1 \le j \le b} \sqrt{d}\, \mathcal{L}_{\max}(B(B_j))$$
$$\le 2d \cdot \sum_{i \ge 1} (1 - \alpha)^i \sqrt{d}\, \mathcal{L}_{\max}(B(V)) + \sqrt{d}\, \mathcal{L}_{\max}(B(V))$$
$$\le \frac{2}{\alpha}\, d\sqrt{d} \cdot \mathcal{L}_{\max}(B(V)),$$

where in the second-to-last inequality we collect every $d$ items from the summation of the first inequality
and then combine them into a geometric series. □
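The final inequality of this proof can be spelled out as follows. This is a short worked step, assuming the geometric series is indexed from $i = 1$ and writing $L$ as a shorthand (introduced here only for brevity) for the longest side of the bounding box of $V$:

```latex
\sum_{i \ge 1} (1-\alpha)^i = \frac{1-\alpha}{\alpha},
\qquad\text{so}\qquad
2d \cdot \frac{1-\alpha}{\alpha}\,\sqrt{d}\,L + \sqrt{d}\,L
  = \Bigl(\frac{2d}{\alpha} - (2d - 1)\Bigr)\sqrt{d}\,L
  \le \frac{2}{\alpha}\, d\sqrt{d}\,L ,
```

where the last step uses $2d - 1 \ge 0$, which holds because $d \ge 1$.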

Lemma 7. Given a set of real numbers $A = \{a_1, a_2, \ldots, a_n\}$, $a_1 \le a_2 \le \cdots \le a_n$, and an interval
$\mathcal{I} = [\ell, r]$ such that $\mathcal{I} \subseteq [a_1, a_n]$, there exists a cutting point $z \in \mathcal{I}$ such that the following holds.

$$L_A(z) \cdot (n - L_A(z)) \cdot |\mathcal{I}| \le \delta_0 \cdot \sum_{1 \le i \le L_A(z)} \; \sum_{L_A(z) < j \le n} (a_j - a_i),$$

where $L_A(z) = |\{a \in A : a < z\}|$ is the number of elements in $A$ that are smaller than $z$ and $\delta_0$ is
a constant.

Proof of Lemma 7. We say that an interval degenerates if it has length zero. First we argue that, if there
are degenerating intervals at $a_1$, then it is always worse to cut at those degenerating intervals. Let $k$,
$1 \le k < n$, be the largest index such that $a_1 = a_2 = \cdots = a_k$. Observe that, for any $i, j$ with $1 \le i, j \le k$,
we have $\mathcal{RC}(i) = \frac{i}{j} \cdot \mathcal{RC}(j)$. On the other hand, for $1 \le i < k$ and $1 \le j \le k - i$, we have

$$\frac{(i + j)(n - i - j)}{i(n - i)} = \frac{i(n - i) + j(n - 2i - j)}{i(n - i)} \le \frac{i + j}{i} = \frac{\mathcal{RC}(i + j)}{\mathcal{RC}(i)},$$

which implies that $\frac{(i + j)(n - i - j)}{\mathcal{RC}(i + j)} \le \frac{i(n - i)}{\mathcal{RC}(i)}$, and therefore cutting at $(a_k, a_{k+1}]$ is always better than cutting
at degenerating intervals at $a_1$. Similarly, we can argue that it is always worse to cut at the degenerating
intervals at $a_n$, if there are any.

Now we argue that there will be a feasible cut satisfying the criterion. According to the given interval
$\mathcal{I} = [\ell, r]$ and the point set $A$, we create a new point set $B = \{b_1, b_2, \ldots, b_n\}$ as follows. For $1 \le i \le n$,

$$b_i = \begin{cases} \ell & \text{if } a_i < \ell, \\ a_i & \text{if } \ell \le a_i \le r, \\ r & \text{otherwise.} \end{cases}$$

Let $z$ be the best cut of $B$ in $\mathcal{I}$. By the above argument, we have $\ell < z \le r$ and therefore
$L_A(z) = L_B(z)$. By Lemma 3, we have $L_B(z) \cdot (n - L_B(z)) \cdot |\mathcal{I}| \le \delta_0 \sum_{b_i < z \le b_j} (b_j - b_i)$. According to
our setting, we have $(b_j - b_i) \le (a_j - a_i)$ for all $1 \le i < j \le n$. Therefore $L_A(z) \cdot (n - L_A(z)) \cdot |\mathcal{I}| \le
\delta_0 \sum_{1 \le i \le L_A(z)} \sum_{L_A(z) < j \le n} (a_j - a_i)$, as claimed. □
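The two facts this proof relies on, that clamping never increases a pairwise difference and that it leaves the count of points smaller than $z$ unchanged for cuts strictly to the right of the left endpoint, can be checked on a small example. The following is a minimal sketch; the helper names `clamp_points` and `count_smaller` are illustrative only.

```python
from typing import List, Sequence


def clamp_points(a: Sequence[float], left: float, right: float) -> List[float]:
    """Move every point outside [left, right] to the nearest endpoint."""
    return [min(max(x, left), right) for x in a]


def count_smaller(points: Sequence[float], z: float) -> int:
    """Number of points strictly smaller than z."""
    return sum(1 for x in points if x < z)


if __name__ == "__main__":
    a = [0, 1, 3, 4, 8, 10]
    b = clamp_points(a, 2, 6)
    print(b)
    n = len(a)
    # Pairwise differences can only shrink under clamping.
    print(all(b[j] - b[i] <= a[j] - a[i] for i in range(n) for j in range(i + 1, n)))
    # Counts agree for cuts z with left < z <= right.
    print([(z, count_smaller(a, z), count_smaller(b, z)) for z in (2.5, 3, 3.5, 6)])
```

At $z$ equal to the left endpoint itself the two counts can differ, which is why the cut must lie strictly to the right of it.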
Theorem 8. Given a set of points $V$ in $\mathbb{R}^d$, Algorithm Euclidean-Spanning-Tree computes a spanning
tree $T$ of $V$ such that the distance-weighted average stretch of $T$ with respect to $V$ is at most $16\delta_0 \cdot d\sqrt{d}$,
where $\delta_0$ is the constant in our point set cutting lemma.

Proof of Theorem 8. If $|V| = 1$, then this theorem holds trivially. Otherwise, by Lemma 12, Lemma 7,
and the fact that the length of the restricted interval is $(1 - 2\alpha) \cdot \mathcal{L}_{\max}(B(V))$, we have

$$\mathcal{R}_T(V_1, V_2) \le |V_1| \cdot |V_2| \cdot \frac{2}{\alpha}\, d\sqrt{d} \cdot \mathcal{L}_{\max}(B(V)) \le \frac{2\delta_0}{\alpha(1 - 2\alpha)}\, d\sqrt{d} \cdot \mathcal{R}(V_1, V_2).$$

This holds for all recursions. Choose $\alpha$ to be $\frac{1}{4}$ and this theorem follows directly by induction on the
depth of recursion. □